perm filename SYMPOS.PUB[L70,TES] blob
sn#025102 filedate 1973-02-13 generic text, type T, neo UTF8
.GROUP SKIP 10
.BEGIN CENTER
LISP70: AN EXTENSIBLE AND PORTABLE LANGUAGE
Lawrence G. Tesler
Horace J. Enea
David C. Smith
Stanford University
February, 1973
.END
.MACRO SEC(NAME) ⊂NEXT PAGE; ONCE NOFILL
NAME
.SKIP 3⊃
.MACRO APP(TAG,NAME) ⊂NEXT PAGE; TAG: NEXT APPENDIX!; ONCE NOFILL
APPENDIX λAPPENDIX!}: NAME
.SKIP 3⊃
.MACRO B ⊂SKIP 1; BEGIN INDENT 10; GROUP ; NOFILL⊃
.MACRO LONG ⊂SKIP 1; BEGIN INDENT 10; NOFILL⊃
.MACRO E ⊂END⊃
.MACRO EC ⊂SKIP 1 ; END CONTINUE⊃
.COUNT REF
.COUNT APPENDIX PRINTING "I"
.TURN ON "↓_α#∪", "λ" FOR "{" ;
.NOJUST
.INDENT 5
.EVERY HEADING({PAGE!},,|Tesler, Enea, and Smith|)
.SKIP 6
ABSTRACT
LISP70 is an extensible and portable language. The compiler consists mainly of
tables of rewrite rules. To extend the language, new rules stated in a
user program are added to
these tables by the compiler.
To compile code for different computers, the code generation
tables are replaced completely. The implementation uses backtracking,
coroutines, and streaming to enhance its generality and efficiency.
NOTICE
This is a limited circulation draft of a paper to be submitted to a
conference or journal. No right to publicize its contents is granted.
.SEC INTRODUCTION
The LISP70 language is based on LISP 1.5 [λREF LISP15}],
with the notable addition
of data type facilities, backtracking, pattern-directed computation,
and coroutines. Two standard notations are provided: MLISP [λREF MLISP_ENEA},
λREF MLISP_SMITH}], which is Algol-like, and LISP [λREF LISP}],
which is a fully parenthesized prefix notation.
The language is extensible and portable.
An "extensible language" [λREF SYMPOSIUM}] is one that enables the
programmer to define and use new
linguistic capabilities that previously were absent.
Sometimes a compiler
which is so clearly organized and documented that modifications require
little effort is called an "extensible compiler" [λREF BABEL}]. However,
unless the language itself includes facilities to specify modifications to
its compiler or interpreter,
it does not merit the label "extensible language".
Several ways of achieving extensibility are possible.
Some compilers regard a statement as a macro which is expanded
repeatedly until object code is obtained. This view leads to extensible languages
in which the programmer adds facilities by defining new macros
[λREF MCILROY}, ***].
Other compilers are driven by operator precedence tables and/or symbol rewrite
rules. This method leads to extensible languages
in which extensions are made by adding new rules to compiler tables [***].
The LISP70 compiler is driven by "pattern rewrite rules"
[λREF KAY_REWRITE}, λREF MLISP2_EXT}, ***],
a generalization of both macros and symbol rewrite rules.
Extensions to the syntax or control structure
are made by specifying new pattern rewrite rules to be incorporated
into the compiler.
LISP70 uses tables of rewrite rules to translate MLISP statements to LISP,
to translate LISP statements to an "ideal-machine language" called ML,
to translate ML instructions to the LAP assembly language of the object
computer, and to assemble the machine language instructions.
Each rewrite rule is a
pattern-matcher, showing what form of input it will match and what
form of output it will produce should it match an input. New rewrite rules
can be added to a rewrite table at any time, thus extending its domain.
For this reason, a function that is defined as a table of rewrite rules is called an
"extensible function".
Extensible functions are used throughout the LISP70 compiler, and may
also be defined in user programs. For example, in a theorem prover,
a system of inference rules may be defined as a rewrite table, to which
rules may be added or removed at any time.
Some early applications of LISP70 are described in [λREF ENEA_AI}, λREF TESLER_AI}].
A language may be called "portable" if its programs can be executed on computers
having different order codes and architectures.
There are several ways of constructing a compiler for a portable language.
One is to have the compiler study a "manual" for each
object machine and determine the best instructions to generate for each statement
of the language. LISP70 settles for a more modest approach.
For each object machine,
a table of pattern rewrite rules is supplied showing the correspondence
between instructions of the ideal machine and of the object machine.
.SEC REWRITE RULES
A rewrite rule specifies a partial function from input streams to output
streams. Each rule consists of two "patterns": a decomposer (DEC) and
a recomposer (REC) [λREF COLBY2}, λREF COLBY1}, λREF ELIZA}].
A rule can be applied to a particular
input stream only if the DEC pattern "matches" that stream. The
result of applying the rule to the input stream is an output stream
generated by the REC pattern.
Examples:
.B
RULES OF SQUARE =
1 → 1,
2 → 4,
5 → 25 ;
.EC
These three rules define the partial function SQUARE. The call {2}@SQUARE
attempts to find a rule of SQUARE whose DEC pattern matches the input stream
{2}. The second rule (with a DEC of "2" and a REC of "4") works, so the value
of the call is {4}. Curly brackets are used in LISP70 as "stream brackets".
A stream is somewhat like a list which has lost its outer parentheses.
A stream of one element is equal to that element; e.g., {2} = 2 .
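The mechanism can be sketched in present-day notation. The following Python
rendering is ours, not LISP70's: a rewrite table is an ordered list of
(DEC, REC) pairs, and with purely literal patterns, matching is simple
equality.

```python
# A rewrite table as an ordered list of (decomposer, recomposer) pairs.
# Streams are represented as tuples; with literal patterns, a DEC
# matches an input stream exactly when the two are equal.
SQUARE_RULES = [
    ((1,), (1,)),
    ((2,), (4,)),
    ((5,), (25,)),
]

def apply_rules(rules, stream):
    """Try each rule in order; return the REC output of the first rule
    whose DEC matches the input stream, else signal that the partial
    function is undefined there."""
    for dec, rec in rules:
        if dec == stream:
            return rec
    raise LookupError("no applicable rule")

# {2}@SQUARE finds the second rule and yields {4}.
```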
.B
RULES OF TIMES =
4 3 → 12,
6 6 → 36,
:X 1 → :X ;
.EC
The symbol :X designates a "variable"; a variable can match any entity in the
input stream, and can echo that entity in the output stream. A variable is
"private" to the rule in which it occurs and cannot change its value.
Thus {92 1}@TIMES binds 92 to :X in the third rule of TIMES and returns
the output stream {92}.
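Pattern variables can be added to the sketch above. In this Python rendering
(again ours, for illustration), a variable binds on first use and must agree
with that binding thereafter, mirroring the single-assignment privacy of :X.

```python
class Var:
    """A private pattern variable like :X -- single-assignment per match."""
    def __init__(self, name):
        self.name = name

X = Var("X")

def match(dec, stream, env):
    """Match a DEC pattern against a stream, binding variables in env;
    return the bindings on success, None on failure."""
    if len(dec) != len(stream):
        return None
    for pat, item in zip(dec, stream):
        if isinstance(pat, Var):
            if pat.name in env and env[pat.name] != item:
                return None          # a variable cannot change its value
            env[pat.name] = item
        elif pat != item:
            return None
    return env

def rewrite(rules, stream):
    """Apply the first rule whose DEC matches; echo variables in the REC."""
    for dec, rec in rules:
        env = match(dec, stream, {})
        if env is not None:
            return tuple(env[p.name] if isinstance(p, Var) else p
                         for p in rec)
    raise LookupError("no applicable rule")

TIMES_RULES = [
    ((4, 3), (12,)),
    ((6, 6), (36,)),
    ((X, 1), (X,)),                  # :X 1 -> :X
]
# {92 1}@TIMES binds 92 to :X in the third rule and yields {92}.
```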
Nonterminal symbols may occur in a DEC-pattern in Backus-Naur Form (BNF).
The following compiler rule translates an MLISP "IF-statement" to a LISP "COND":
.B
RULES OF MLISP =
IF <MLISP>:X THEN <MLISP>:Y ELSE <MLISP>:Z
→ (COND (:X :Y) (T :Z)) ;
.EC
The occurrence of "<MLISP>:X" in a DEC-pattern calls MLISP recursively to
match a sub-stream of the input stream, and binds the result of the recursive
call (the LISP translation) to :X.
Other pattern matching facilities are described in Appendix λAPPENDIX MATCHER}.
Pattern matching has become a standard tool of computer science; precedents in
the literature are too numerous to mention.
.SEC PRIORITY FUNCTIONS
The DEC patterns of several rules may all match a given
input stream; all such rules are called "applicable".
The rules of a table are tried in "priority"
order until one is found which is applicable.
The priority order of rules is determined by a "priority function" which can be
specified in the declaration of the table using a BY clause:
.B
RULES OF F BY APPEARANCE =
A :X → A,
:X B → 5,
:X :Y → :Y;
.EC
The priority function APPEARANCE assigns higher priorities to rules appearing
earlier, so that they are tried first.
If no priority function is mentioned in a BY clause, a standard one called
SPECIFICITY is used.
Its effect is to try specific rules before general rules. For example,
it tries a rule whose DEC
contains literals before one whose DEC contains variables and it tries longer DEC
patterns before shorter ones. Rules are compared for SPECIFICITY by a left to
right scan, an appropriate method for ordering the translation rules of a compiler.
The complete definition of SPECIFICITY is beyond the scope of this paper.
The programmer may define and use priority functions better adapted to
the problem at hand. A priority function
is generally defined by a rewrite table which, supplied a pair of LISP-encoded
rewrite rules, tells which has priority.
New rules can be added to an existing rewrite table as in the example below,
which adds to the domain of square both
{17}@SQUARE=289 and {6}@SQUARE={6 6}@TIMES=36 :
.B
RULES OF SQUARE ALSO =
17 → 289,
:N → {:N :N}@TIMES;
.E
The use of the priority function SPECIFICITY often makes it unnecessary to specify
where in the table the new rules should be inserted. However, the use of
priority functions such as APPEARANCE requires such specification.
Facilities for this purpose are provided in LISP70,
as well as facilities for deleting
rules, for generalizing in certain ways, and for scanning and examining
rules.
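The effect of a priority function can be sketched as a sort key over rules.
The key below is an illustration of ours, not the complete definition of
SPECIFICITY: longer DECs come first, then DECs with more literals, with a
stable sort preserving appearance order among ties.

```python
class Var:
    """A pattern variable marker, as in :X."""
    def __init__(self, name):
        self.name = name

def specificity_key(rule):
    """Order rules so that more specific DECs are tried first:
    longer DECs before shorter, more literals before fewer.
    (A rough stand-in for LISP70's SPECIFICITY, not its definition.)"""
    dec, _rec = rule
    literals = sum(1 for p in dec if not isinstance(p, Var))
    return (-len(dec), -literals)

def add_rules(table, new_rules):
    """Merge new rules into an existing table (RULES OF ... ALSO),
    keeping priority order; sort() is stable, so appearance breaks ties."""
    table.extend(new_rules)
    table.sort(key=specificity_key)

X = Var("X")
SQUARE = [((X,), "general")]
add_rules(SQUARE, [((17,), "literal")])
# The literal rule for 17 now precedes the general :N rule, so the
# programmer need not say where the new rule is inserted.
```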
.SEC EXTENSIBILITY
To extend the syntax of LISP70, rewrite rules are added to the
MLISP-to-LISP translator. New MLISP constructions can be defined in
terms of existing MLISP or LISP constructions.
Alternatively, MLISP can be completely replaced by a different top-level language
such as Algol-60.
The following examples demonstrate
two possible definitions of a "WHERE" facility [λREF ISWIM}]
for MLISP in terms of a LISP "PROG" and in terms of an MLISP "block".
.B
RULES OF MLISP ALSO =
<MLISP>:X WHERE <IDENTIFIER>:V '= <MLISP>:E
→ (PROG (:V) (SETQ :V :E) (RETURN :X)) ;
RULES OF MLISP ALSO =
<MLISP>:X WHERE <IDENTIFIER>:V '= <MLISP>:E
→ { BEGIN
PRIVATE <IDENTIFIER>:V ';
<IDENTIFIER>:V '← <MLISP>:E ';
RETURN <MLISP>:X ';
END } @ MLISP ;
.EC
Although a literal identifier or number is represented by itself in a pattern,
a special character such as "=" must be quoted thus:#'=#.
Applying the above rules to the expression:
.B
A*Y+Y↑2 WHERE Y=F(B)
.EC
results in either the LISP translation:
.B
(PROG (Y)
(SETQ Y (F B))
(RETURN (PLUS (TIMES A Y) (POWER Y 2)))
)
.EC
or the MLISP translation:
.B
{ BEGIN
PRIVATE Y ;
Y ← F(B) ;
RETURN A*Y+Y↑2 ;
END } @ MLISP
.EC
which calls MLISP recursively to translate the block to the LISP "PROG" shown above.
A complete compiler for a subset of LISP70 is presented in
Appendix λAPPENDIX SUBSET}, and examples of extensions to that compiler are
presented in Appendix λAPPENDIX EXTENSIONS}.
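The first WHERE rule can be sketched as a function on S-expressions,
here represented as nested Python tuples (an editorial rendering; the
representation and names are ours).

```python
def rewrite_where(body, var, expr):
    """X WHERE V = E  ->  (PROG (V) (SETQ V E) (RETURN X))"""
    return ("PROG", (var,), ("SETQ", var, expr), ("RETURN", body))

# A*Y+Y^2 WHERE Y=F(B):
lisp = rewrite_where(("PLUS", ("TIMES", "A", "Y"), ("POWER", "Y", 2)),
                     "Y", ("F", "B"))
# lisp is the PROG form shown in the LISP translation above.
```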
Extensions to a rewrite table may be made temporarily during
the compilation of a particular program block. This provides an approach
to structured programming wherein a programming problem not only entails the
solution of subproblems by procedures but also entails
the description of solutions
by appropriately defined language additions.
This approach has been in use by the authors and others since 1971
using MLISP2 [λREF MLISP2_EXT}], the predecessor to LISP70.
It has the advantage of making programs
shorter and easier to read and write, once a suitable language has been designed.
It has proved to be surprisingly easy to make extensions when needed. The major
disadvantage is that each programmer tends to develop a personal notation and
data representation which can be confusing to others.
Extensibility can be used to "tune" the compiler to a given problem or class
of problems. Once the critical loop of a program is identified, the programmer
can write special rewrite rules at various levels of the translator
to generate better code at the cost of slightly slower compilation.
The semantics of LISP70 can be extended as well as its syntax.
New rules can be
added to the LISP-to-ML translator, permitting the
definition of new LISP primitives such as block structure
and context mechanisms.
New rules can also be added to the ML-to-LAP translator,
permitting the addition of facilities that did not previously exist on the ideal
machine.
Alternatively, the ML-to-LAP translator may be completely replaced by code generators
for a different computer. This provides a degree of "portability".
.SEC PORTABILITY
It is possible to implement a multi-language multi-computer compiler
organized as in the following diagram:
.B
.TURN OFF "↓"
MLISP     QA-4     ALGOL-60     other
  ↓        ↓          ↓           ↓
  *--------*----------*-----------*
                  ↓
                LISP
                  ↓
      *-----------*-------*
      ↓                   ↓
      ↓                  ML
      ↓                   ↓
      ↓        *----------*---------*
      ↓        ↓          ↓         ↓
    B1700    PDP-10    IBM 360    other
.EC
For example, ALGOL-60 could be translated to an appropriately extended LISP,
the LISP could be translated to ML, and ML could be translated to the
machine code of the IBM 360. In the case of a microprogrammed
machine like the B1700, or in the case of an interpreter-based implementation,
the LISP code could be executed directly without going through ML.
The use of the intermediate machine-independent language ML might appear to
decrease the efficiency of the object code and of the compiler.
Brown [λREF LOWL}] has shown that with proper design a penalty of only
2.5% to 6% in extra object instructions need be paid for the intermediate
use of a low-level language. LISP70 further enhances efficiency by use of
a machine-dependent peephole optimizer in critical portions of a program.
To minimize
compiling overhead, LISP70 "pipelines" the translation steps as described
later in this paper.
Any LISP70 implementation can generate code for any computer whose code
generators and elementary i/o routines are defined in machine-dependent
code. The LISP70
intrinsic functions are
all programmed in MLISP to maximize the machine independence of the system.
.SEC COMPILATION OF REWRITE TABLES
The semantics of rewrite rules do not require that decomposition and recomposition
proceed from left-to-right,
nor even that decomposition finish before
recomposition begins. The implementor is allowed
some latitude in choosing matching and generation
algorithms. Due to the single-assignment nature of the private
variables in a rewrite rule,
parallel hardware often can be used to advantage if the
input and output streams are represented by random-access structures
[λREF COMPEL}]. Left
recursion difficulties can be avoided by right-to-left processing, by look-ahead,
or by iteration, and special cases can be recognized by the compiler and optimized.
In most instances, strictly left-to-right top-down processing
with backtracking is both sufficient
and easiest to implement. If the compiler determines that this is the correct
approach, it can translate each rewrite rule to an equivalent algorithm using the
"streaming" facilities of LISP70.
When streaming is employed,
the input stream or "source" is pointed at by a "source
pointer" and the output stream or "sink" by a "sink pointer".
The DEC pattern works from
left to right advancing the source pointer as it matches the input,
and the REC pattern
works from left to right advancing the sink pointer as it generates the output.
The source and sink can be any sequential data structure
or process. Alternatively, the source can be an argument list passed to the
extensible function and the sink can be the result returned by the function.
It is possible to set up a pipeline of coroutined
processes as diagrammed below [cf. λREF KAY_PIPE}]:
.B
 -------     -------     -------     -------     -----------     --------
| MLISP |   | MLISP |   | LISP  |   |  ML   |   |           |   |        |
| text  | → |  -to- | → |  -to- | → |  -to- | → |    LAP    | → |  code  |
| file  |   | LISP  |   |  ML   |   |  LAP  |   | assembler |   | vector |
|       |   | tran. |   | tran. |   | tran. |   |           |   |        |
 -------     -------     -------     -------     -----------     --------
.EC
The source of each translator is the adjacent process or file on its left, and its
sink is the adjacent process or vector on its right. By streaming data directly
from process to process, most intermediate results can be stored on stacks and the
high cost of list-structure garbage is avoided. It is largely by this means that
the penalty for using a four-step translation is minimized.
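The coroutined pipeline can be sketched with chained generators, each stage
consuming its left neighbor's stream on demand so that no intermediate
list is built. The stages below are toy stand-ins of ours, not the real
translators.

```python
def tokens(text):
    """Leftmost stage: a source of tokens from a text."""
    yield from text.split()

def upcase(source):
    """A stand-in for one translation stage: consumes its neighbor."""
    for item in source:
        yield item.upper()

def number(source):
    """A stand-in for the next stage in the pipeline."""
    for i, item in enumerate(source):
        yield (i, item)

pipeline = number(upcase(tokens("a b c")))
# Items flow from stage to stage one at a time, as each stage demands
# input from the stage on its left.
```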
To increase efficiency, backtracking can be reduced by factoring common left-hand
items out of several rules:
.B
RULES OF G =
A B C → C,
A B :X → T,
A R → 4,
D R → 5,
:I :X 6 → 16,
:X :Y → 2,
:X :Y :Z → 3;
.EC
Here, A can be factored out of the first three rules and A B out of the first two.
{:X :Y} can be factored out of the last two rules. Notice that systematic renaming
of the variables of a rule does not change its effect;
thus, {:X :Y} can be factored
out of the fifth rule after renaming :I to :X and :X to :Y.
Hashing can be used to advantage after factoring DEC patterns. If several
rules are identical up to a certain point and their first difference is different
literals in the same position, then a hash can be used to choose the applicable
rule. In the example
above, the choice between the rules that start with "A" and with
"D" can be made by a hash. Of course, such optimization is more helpful when
more than two different literals are involved.
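Factoring plus hashing can be sketched as a trie of literals, in which each
level's dictionary lookup plays the role of the hash. This illustration
(ours) handles only literal DEC patterns of distinct shapes; variables and
overlapping prefixes are omitted.

```python
def build_trie(rules):
    """Factor shared literal prefixes of DEC patterns into nested dicts;
    each dict lookup is the 'hash' among rules differing in one literal."""
    trie = {}
    for dec, rec in rules:
        node = trie
        for item in dec[:-1]:
            node = node.setdefault(item, {})
        node[dec[-1]] = rec          # leaf: the REC output
    return trie

def lookup(trie, stream):
    """Walk the trie one hash per input element; KeyError means
    no applicable rule."""
    node = trie
    for item in stream:
        node = node[item]
    return node

RULES = [(("A", "B", "C"), "C"),
         (("A", "R"), 4),
         (("D", "R"), 5)]
# "A" is factored out of the first two rules, and the choice between
# the rules starting with "A" and with "D" is made by one hash.
```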
The backtracking facility of LISP70 is discussed
at length in Appendix λAPPENDIX BACKTRACKING},
the streaming facility in Appendix λAPPENDIX STREAMING},
and examples of algorithmic equivalents
to rewrite rules in Appendix λAPPENDIX EQUIVALENTS}.
.SEC STATUS OF IMPLEMENTATION
LISP70 has both specific and general purposes.
The kernel language, which is already
implemented on the PDP-10 [*], includes just enough facilities to compile
the compiler and execute standard LISP jobs.
The kernel is a general purpose language, extensible in many directions
to tailor it to different applications, to tune it to different problems, and to
implement it on different machines.
An expanded version of LISP70 is under development, intended specifically as a
language for artificial intelligence, natural language processing, and
simulation of cognitive processes. To suit these purposes, such facilities
as gremlins and
asynchronous processes will be included.
.SKIP 5
[*] STATUS Feb. 1, 1973: very buggy; not available to users.
.APP BACKTRACKING, BACKTRACKING
NONDETERMINISM
.SKIP 3
The state of a LISP process is completely defined at any moment by
the bindings of variables (ALIST), the properties of atoms (OBLIST),
the Program Counter or Point of Control (PC), and input-output pointers.
It is not hard to make a LISP system write out the entire state of
a running process by printing the values of these items. One can also
conceive of stacking the state of a LISP process on a "State Stack" and
after running awhile, restoring one of the former states that has been
saved on the State Stack. Proceeding from the restored PC would then
cause an exact repetition of the previous action of the process. This
state-saving feature is useful for debugging [λREF PLANNER_BACK}]
and will be provided in LISP70.
State-saving is useful not only in debugging but also in problem-solving.
In particular, it is useful in the implementation of "non-deterministic"
algorithms. A non-deterministic algorithm performs actions which
are not determined solely by past events but partially by future events.
Compare the Algol statement below with the accompanying statement that
might appear in a mythical non-deterministic language:
.B
if x>10 then y ← y + 1 ;
if later x > 10 then y ← y + 1 ;
.EC
In the first case, the decision whether to step y is determined completely
by past events, because the decision is based on the current value of x.
In the second case, the decision whether to step y is not determined
completely by past events, because the decision is based on a future value
of x.
There are several ways of implementing the second statement on a computer.
One way is to create two identical processes, in one of which
it is assumed that x will be greater than 10 and in the other
that it will not. When x finally receives a value, the process operating
under the incorrect assumption can be eliminated. This method is often called
the "multiple world" method, and is conceptually simplest.
Another method avoids creating multiple processes. The current state
is saved on a state stack and the program continues on the arbitrary assumption
that x will turn out to be 10 or smaller.
When x finally receives a value, if the
guess was correct, the stacked state can be discarded. If the
guess turns out to be incorrect, the current state can be discarded,
the stacked state restored, and the computation reiterated based on the
corrected assumption. This is often called the "backtracking method".
It can be more economical of space than the multiple world method, because
only one process at a time is active.
The backtracking method has some disadvantages as compared with the multiple
world method. If some of the processes created by nondeterminism are non-terminating,
the multiple world method can eliminate them from the processor as soon as
some alternate process terminates, but the backtracking
method may loop endlessly because it only eliminates a process
once a guess is proven to be incorrect.
A further disadvantage is that facilities to communicate the progress of one
process to another are inherently limited by the sequential manner of trying
alternatives.
.SEC NONDETERMINISTIC ALGORITHMS
Floyd [λREF FLOYD}]
suggested three primitive operations that are sufficient to specify any
non-deterministic algorithm.
The operations are called "choice", "success", and "failure".
"Choice" is a
function of one integer argument and returns one positive integer value
less than or equal to the argument.
For example, the call "I ← CHOICE(3)" sets I to either 1, 2, or 3;
in a multiple world implementation,
three processes are created that are identical except
for the value of I. Any of these processes in which the function FAILURE() is
called is terminated unsuccessfully and eliminated from the processor.
Any process in which the
function SUCCESS() is called
is terminated successfully; at that time, all other
unfinished processes may be eliminated if desired.
It is possible for a process spawned by a CHOICE to call CHOICE again and
thus be replaced by several descendant processes. If the multiple world
method is employed, a
large number of processes may be created after several generations,
demanding excessive space
in a conventional computer. To reduce the space required, Floyd recommended
a backtracking method.
A call on CHOICE stacks the current state
and returns only the value 1. Whenever FAILURE() is called,
the current state is discarded and replaced by the top state on the
stack, only this time CHOICE returns a value one greater then it returned
last time, e.g., 2 instead of 1. If failures
exhaust all N values of CHOICE(N), then the top state on the stack is
discarded and the CHOICE waiting in the next stacked state is handled.
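Floyd's primitives can be sketched in Python with exceptions and an explicit
state stack. This rendering of ours saves only the sequence of choices and
reruns the algorithm from the start, rather than saving real machine state
as LISP70 does.

```python
class Failure(Exception):
    """Raised where a non-deterministic algorithm calls FAILURE()."""
    pass

def solve(f):
    """Run f(choice) by backtracking; choice(n) yields an integer in 1..n."""
    stack = [[]]                      # stacked "states": choice prefixes
    while stack:
        prefix = stack.pop()
        pending = []                  # choices made on the current run
        def choice(n):
            k = len(pending)
            if k < len(prefix):       # replaying a restored state
                pending.append(prefix[k])
            else:                     # fresh choice: take 1, stack 2..n
                for v in range(n, 1, -1):
                    stack.append(pending + [v])
                pending.append(1)
            return pending[-1]
        try:
            return f(choice)          # SUCCESS: remaining states dropped
        except Failure:
            continue                  # FAILURE: restore a stacked state
    raise Failure("all choices exhausted")

def factor_pair(choice):
    """Find i < j with i*j = 6 -- determined only by 'future' tests."""
    i, j = choice(5), choice(5)
    if i < j and i * j == 6:
        return (i, j)
    raise Failure()
```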
Even the backtracking method would demand excessive storage if the entire
ALIST and OBLIST were stacked at every call on CHOICE. Instead, a practical
implementation only stacks the difference between successive
states. This is called "incremental state saving".
Thus, if the current state is S4 and stacked states are S3, S2, and
S1, then all that really needs to be stacked are S4-S3, S3-S2, and S2-S1.
FAILURE() merely changes the current state S4 according to the state difference
S4-S3 in order to restore the state S3.
To maintain a correct record of
state differences takes a little bit of bookkeeping every time the state
is changed. Floyd's paper described an elegant method of
accomplishing this. Unfortunately, the bookkeeping overhead can become
uncomfortably expensive in both time and state stack space.
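Incremental state saving can be sketched as a "trail" of undo records: each
assignment logs the old value, CHOICE takes a mark, and FAILURE pops the
trail back to the mark. The class below is an illustration of ours, not
LISP70's representation.

```python
class State:
    """Variable bindings with a trail of (name, old value) undo records."""
    def __init__(self):
        self.bindings = {}
        self.trail = []

    def set(self, name, value):
        # Log the difference before changing the state.
        self.trail.append((name, self.bindings.get(name)))
        self.bindings[name] = value

    def mark(self):
        """Taken at each call on CHOICE: a cheap stand-in for S3, S2, ..."""
        return len(self.trail)

    def undo_to(self, mark):
        """FAILURE: replay the state differences to restore an old state.
        (For the sketch, an old value of None means 'previously unbound'.)"""
        while len(self.trail) > mark:
            name, old = self.trail.pop()
            if old is None:
                self.bindings.pop(name, None)
            else:
                self.bindings[name] = old
```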
Some languages [λREF PLANNER_BACK}, λREF ECL_BACK}] reduce this
overhead by requiring the programmer to analyze the program and
designate which state changes need to
be recorded. In MLISP2 [λREF MLISP2_BACK}],
the programmer is relieved of this burden:
the system performs most of the analysis automatically.
Frequently changed components of the state, such as the local variables of
the current function, are saved only at calls
on CHOICE instead of at every change. This is
called "frequency-sensitive incremental state saving".
LISP70 utilizes the frequency-sensitive method. At failures,
all portions of the state are automatically restored
except those specifically exempted by the programmer.
At successes, the state stack is emptied. "Partial successes" are also
available to prune selected unneeded states from the stack.
.APP STREAMING, STREAMING
Much of the processing performed by computers is sequential, including
serial input-output, list scanning, and character string parsing. To
simplify the description of sequential processing, LISP70 features a
"streaming" facility. Any sequential data structure (such as a text file,
a list, or a string) may be treated as a "source" or as a "sink". A
"source pointer" may be pointed at any element of a source, and a
"sink pointer" may be pointed at the last element of a sink. Data may be
"streamed" from a source beginning at the element designated by the
source pointer, and data may be "streamed" to the end of a sink.
The primitives important in streaming are the functions SOURCE_POINTER,
SINK_POINTER, THIS, and NEXT.
If S is a sequential data structure, then SOURCE_POINTER(S, E) produces a
source pointer pointing to the first element of S, while SINK_POINTER(S, E)
produces a sink
pointer pointing to the last element of S. The argument E is used to define
the nature of an "element". For example, if S is a text file, it may
be regarded as a series of characters, as a series of tokens, or as a
series of S-expressions. If the argument E is omitted, then the elements are
assumed to be S-expressions.
If SP is any pointer, then THIS(SP) returns the element at which SP points.
If SP is a source pointer in particular,
then NEXT(SP) returns THIS(SP) and advances SP one
element to the right.
If SP is a sink pointer, then NEXT(SP)←X lengthens the sink by one element,
makes SP point to the new last element, and makes that element be X.
If SP points past the end of a source, then THIS(SP) returns the special
value EOF (end of stream) and NEXT(SP) calls FAILURE().
If SP points to an empty sink, then THIS(SP) returns the special value BOF
(beginning of stream).
A pointer may be pointed at a process. In that case, NEXT activates the
process until it outputs a "source element" or accepts a "sink element".
Backtracking restores the positions of all pointers and the states of all
sources and sinks except those specifically exempted by the programmer.
The following MLISP program copies the file A to the end of the list L, omitting all
instances of the number 4 from the copy.
It uses the predicate SUCCEEDS, which returns FALSE
if the evaluation of its argument
calls FAILURE(), and otherwise returns non-FALSE:
.B
BEGIN
POINTER S, D ;
S ← SOURCE_POINTER(A) ; D ← SINK_POINTER(L) ;
WHILE SUCCEEDS(X ← NEXT(S)) DO IF X≠4 THEN NEXT(D) ← X ;
END
.E
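The same program can be rendered in Python with simple pointer classes that
mirror the primitives above (names follow the text; the classes themselves
are our illustration, with lists standing in for files).

```python
EOF = object()   # the special end-of-stream value

class SourcePointer:
    """A source pointer over a sequential structure."""
    def __init__(self, seq):
        self.seq, self.pos = seq, 0
    def this(self):
        return self.seq[self.pos] if self.pos < len(self.seq) else EOF
    def next(self):
        if self.pos >= len(self.seq):
            raise StopIteration       # stands in for FAILURE()
        item = self.seq[self.pos]
        self.pos += 1
        return item

class SinkPointer:
    """A sink pointer: NEXT(SP) <- X lengthens the sink by one element."""
    def __init__(self, seq):
        self.seq = seq
    def put(self, x):
        self.seq.append(x)

def copy_omitting(a, l, omit=4):
    """Copy source A to the end of list L, omitting instances of `omit`."""
    s, d = SourcePointer(a), SinkPointer(l)
    while True:
        try:
            x = s.next()
        except StopIteration:         # SUCCEEDS(...) returns FALSE
            return l
        if x != omit:
            d.put(x)
```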
.APP EQUIVALENTS, ALGORITHMIC EQUIVALENTS
The algorithmic equivalent of the rewrite rule:
.B
:Y U :X → :X 4 :Y
.EC
is the MLISP statement:
.B
BEGIN
PRIVATE X, Y ;
Y ← NEXT(SOURCE) ;
IF NEXT(SOURCE) ≠ 'U THEN FAILURE() ;
X ← NEXT(SOURCE) ;
NEXT(SINK) ← X ;
NEXT(SINK) ← 4 ;
NEXT(SINK) ← Y ;
END
.E
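A Python rendering of the same algorithm (ours, for comparison) uses a plain
iterator as the source and a list as the sink.

```python
class Failure(Exception):
    pass

def rule(source, sink):
    """Algorithmic equivalent of  :Y U :X -> :X 4 :Y  ."""
    y = next(source)
    if next(source) != "U":
        raise Failure()
    x = next(source)
    sink += [x, 4, y]
    return sink

out = rule(iter(["hello", "U", "world"]), [])
# out == ["world", 4, "hello"]
```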
The following example demonstrates factoring common left-hand
items out of several rewrite rules:
.B
RULES OF G =
A B C → C,
A B :X → T,
A R → 4,
D R → 5,
:I :X 6 → 16,
:X :Y → 2,
:X :Y :Z → 3;
.EC
Its algorithmic equivalent is:
.B
FUNCTION G =
BEGIN
PRIVATE X, Y, Z ;
CASE CHOICE(3) OF
BEGIN
IF NEXT(SOURCE) = 'A THEN
IF NEXT(SOURCE) = 'B THEN
CASE CHOICE(2) OF
BEGIN
IF NEXT(SOURCE) = 'C THEN
NEXT(SINK) ← 'C
ELSE FAILURE() ;
BEGIN
X ← NEXT(SOURCE) ;
NEXT(SINK) ← 'T
END
END
ELSE IF NEXT(SOURCE) = 'R THEN
NEXT(SINK) ← 4
ELSE FAILURE()
ELSE FAILURE() ;
IF NEXT(SOURCE) = 'D ∧ NEXT(SOURCE) = 'R THEN
NEXT(SINK) ← 5
ELSE FAILURE() ;
BEGIN
X ← NEXT(SOURCE) ;
Y ← NEXT(SOURCE) ;
CASE CHOICE(3) OF
BEGIN
IF NEXT(SOURCE) = 6 THEN
NEXT(SINK) ← 16
ELSE FAILURE() ;
BEGIN
Z ← NEXT(SOURCE) ;
NEXT(SINK) ← 3 ;
END ;
NEXT(SINK) ← 2
END
END ;
END
.EC
The values of X, Y, and Z are never used in this
example, and an optimizing compiler would not assign or even declare them.
LISP70
provides various ways of calling an extensible function, depending on whether
or not the entire source must be exhausted during the
match and whether the absence
of any applicable rule should cause a failure or an error condition.
The actual algorithmic equivalent of a rewrite table must include additional
tests based on these modes.
Upon FAILURE, it is possible to back up to a CHOICE within some rewrite
table. The effect is as if its last chosen applicable rule had failed;
the next rule in priority order is tried instead. To inhibit this feature,
it is possible to specify that a rule is "preemptive" by doubling
its arrow (→→).
If one of the applicable rules that is tried is a preemptive rule and it
leads to failure,
then no other rules of the rewrite table will be tried on the input stream
even though they apply.
When extensions are made to a rewrite table, each new rule is translated to its
algorithmic equivalent in a special LISP notation
and merged into the existing equivalent
algorithm, guided by the priority function to determine ordering, factoring,
and hashing opportunities. The object code is marked "invalid"
so that the next attempt to execute it will cause an interrupt. At that time
the latest LISP version will be compiled
and the resulting object code will be marked "valid" and
executed.
Thus, recompilation is deferred until actually required
[cf. λREF MITCHELL}].
Future versions of the
system will be able to recompile just those portions of the table which
are affected by the extensions.
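Deferred recompilation can be sketched as an invalidation flag: extending the
table marks the compiled form invalid, and the next call recompiles on
demand. The class below is a schematic illustration of ours; real
compilation of a rewrite table is stood in for by a dictionary.

```python
class ExtensibleFunction:
    """A rewrite table whose 'object code' is recompiled lazily."""
    def __init__(self, rules):
        self.rules = list(rules)
        self.compiled = None          # None plays the role of "invalid"

    def extend(self, new_rules):
        """RULES OF ... ALSO: add rules and invalidate the object code."""
        self.rules.extend(new_rules)
        self.compiled = None

    def _compile(self):
        # Stand-in for real compilation of the merged rule table.
        table = dict(self.rules)
        return table.get

    def __call__(self, arg):
        if self.compiled is None:     # the "interrupt" on invalid code
            self.compiled = self._compile()
        return self.compiled(arg)
```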
.APP MATCHER, FACILITIES OF THE PATTERN MATCHER
segmenting, side-conditions, conjunctive match, disjunctive match,
non-match, repetition, evaluation, look-ahead, look-behind, and reversible rules.
.APP SUBSET, A COMPILER FOR A SUBSET OF LISP70
.APP EXTENSIONS, A LISP70 PROGRAM USING COMPILER EXTENSIONS
.MACRO BOOK(TAG, AUTHOR, TITLE, DESCRIPTION) ⊂
.SEND REFERENCES ⊂}<TAG≤AUTHOR≥↓_TITLE_↓, DESCRIPTION>λ⊃⊃
.
.MACRO ARTICLE(TAG, AUTHOR, TITLE, DESCRIPTION) ⊂
.SEND REFERENCES
. ⊂}<TAG≤AUTHOR≥"TITLE", DESCRIPTION>λ⊃⊃
.
.ARTICLE(ENEA_AI, |Enea, H., Colby, K. M., and Moravec, H.|,
.Idiolectic Language-Analysis for Understanding Doctor-Patient Dialogues,
.|(submitted to IJCAI)|)
.
.ARTICLE(TESLER_AI, |Tesler, L., Enea, H., and Smith, D.|,
.LISP70: A Successor to LISP 1,
.|(submitted to IJCAI)|)
.
.ARTICLE(MLISP_ENEA, |Enea, H.|, |MLISP (IBM 360/67)|,
.|Computer Science Technical Report CS 92, Stanford University, 1968|)
.
.ARTICLE(MLISP_SMITH, |Smith, D.|, |MLISP (PDP-10)|,
.|Artificial Intelligence Memo No. 135, Stanford University, Oct. 1970|)
.
.BOOK(LISP15, |McCarthy, J. et al.|, LISP 1.5 Programmer's Manual, |MIT Press, 1962|)
.
.ARTICLE(LISP, |McCarthy, J.|,
.|Recursive Functions of Symbolic Expressions and their Computation by
. Machine, Part I|, |Comm.#ACM ∪3, 4, (April#1960), 184-195|)
.
.ARTICLE(COLBY1, |Colby, K. M., Watt, J., and Gilbert, J. P.|,
.A Computer Method of Psychotherapy,
.|J. of Nervous and Mental Disease ∪142, 1966, 148-152|)
.
.ARTICLE(COLBY2, |Colby, K. M. and Enea, H.|,
.|Heuristic Methods for Computer Understanding of Natural Language in Context
. Restricted On-Line Dialogues|,
.|Math. Biosciences ∪1, 1967, 1-25|)
.
.ARTICLE(ELIZA, |Weizenbaum, J.|,
.|ELIZA -- A Computer Program for the Study of Natural Communication Between
. Man and Machine|, |Comm.#ACM, ∪9, 1, (Jan.#1966), 36-45|)
.
.ARTICLE(MCILROY, |McIlroy, M. Douglas|,
.|Macro Instruction Extension of Compiler Languages|,
.|Comm.#ACM, ∪3, 4, (April#1960), 214-220|)
.
.ARTICLE(SYMPOSIUM, |Schuman, S., ed.|,
.Proceedings of the International Symposium on Extensible Languages,
.|ACM SIGPLAN Notices, ∪6, 12, (Dec.#1971)|)
.
.ARTICLE(MLISP2_EXT, |Smith, D. and Enea, H.|,
.MLISP2 -- A Programming Language for Writing and Debugging Translators,
.|(submitted to IJCAI)|)
.
.ARTICLE(MLISP2_BACK, |Smith, D. and Enea, H.|,
.Backtracking in MLISP2, |(submitted to IJCAI)|)
.
.ARTICLE(LOWL, |Brown, P. J.|, Levels of Language for Portable Software,
.|Comm.#ACM, ∪15, 12, (Dec.#1972), 1059-1062|)
.
.ARTICLE(COMPEL, |Tesler, L. and Enea, H.|, A Language Design for Concurrent Processes,
.|Proc.#SJCC, ∪32, 1968, 403-408, AFIPS Press|)
.
.ARTICLE(FLOYD, |Floyd, R. W.|, Nondeterministic Algorithms,
.|J.#ACM, ∪14, 4, (Oct.#1967), 636-644|)
.
.ARTICLE(PLANNER_BACK, |Hewitt, C.|, Procedural Embedding of Knowledge in PLANNER,
.|Proc.#IJCAI, ∪2, 1969, 295-301|)
.
.ARTICLE(ECL_BACK, |Prenner, C. J., Spitzen, Jay M., and Wegbreit, B.|,
.An Implementation of Backtracking for Programming Languages,
.|ACM SIGPLAN Notices ∪7, 1, (Nov. 1972), 36-44|)
.
.ARTICLE(BABEL, |Scowen, R. S.|, An Application of Extensible Compilers,
.|in [λREF SYMPOSIUM}], pp.1-7|)
.
.ARTICLE(KAY_REWRITE, |Kay, A.|, **, **)
.
.ARTICLE(KAY_PIPE, |Kay, A.|, **, **)
.
.BOOK(MITCHELL, |Mitchell, J.|,
.|The Design and Construction of Flexible and Efficient
.Interactive Programming Systems|,
.|PhD. Thesis, Carnegie Mellon University, June 1970|)
.
.PORTION REFERENCES
.ONCE CENTER
↓_REFERENCES_↓
.SKIP 4
.AT "<" TAG "≤" AUTHOR "≥" DESCRIPTION ">" ⊂
.TAG: NEXT REF;
.BEGIN GROUP
[λREF}] AUTHOR, DESCRIPTION
.END ⊃
.INDENT 0,5
.RECEIVE "≤≥"